Handset-dependent background models for robust text-independent speaker recognition
Abstract
This paper studies the effects of handset distortion on telephone-based speaker recognition performance, resulting in the following observations: (1) the major factor in speaker recognition errors is whether the handset type (e.g., electret, carbon) is different across training and testing, not whether the telephone lines are mismatched, (2) the distribution of speaker recognition scores for true speakers is bimodal, with one mode dominated by matched handset tests and the other by mismatched handsets, (3) cohort-based normalization methods derive much of their performance gains from implicitly selecting cohorts trained with the same handset type as the claimant, and (4) utilizing a handset-dependent background model which is matched to the handset type of the claimant's training data sharpens and separates the true and false speaker score distributions. Results on the 1996 NIST Speaker Recognition Evaluation corpus show that using handset-matched background models reduces false acceptances (at a 10% miss rate) by more than 60% over previously reported (handset-independent) approaches.
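As a rough illustration of observation (4), the sketch below scores a test utterance as a log-likelihood ratio between a claimant model and a background model chosen by the handset type of the claimant's training data. The abstract does not give the scoring formula, so the diagonal-covariance Gaussian mixture form, the function names (`gmm_avg_loglik`, `verification_score`), and the toy one-dimensional models are all assumptions made for illustration, not the authors' implementation.

```python
import math

def gmm_avg_loglik(frames, gmm):
    """Average per-frame log-likelihood under a diagonal-covariance GMM.

    gmm is a list of (weight, means, variances) tuples (hypothetical layout).
    """
    total = 0.0
    for x in frames:
        comp = []
        for w, mu, var in gmm:
            ll = math.log(w)
            for xi, m, v in zip(x, mu, var):
                ll += -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
            comp.append(ll)
        # log-sum-exp over mixture components for numerical stability
        top = max(comp)
        total += top + math.log(sum(math.exp(c - top) for c in comp))
    return total / len(frames)

def verification_score(frames, claimant_gmm, background_gmms, train_handset):
    """Log-likelihood ratio against the handset-matched background model.

    background_gmms maps a handset label (e.g. "electret", "carbon") to a GMM;
    train_handset is the handset type of the claimant's training data.
    """
    background = background_gmms[train_handset]
    return gmm_avg_loglik(frames, claimant_gmm) - gmm_avg_loglik(frames, background)

# Toy one-dimensional models, invented purely for illustration.
claimant = [(1.0, [1.0], [1.0])]
backgrounds = {
    "electret": [(1.0, [0.0], [1.0])],
    "carbon": [(1.0, [3.0], [1.0])],
}
test_frames = [[1.0], [1.2], [0.8]]
print(verification_score(test_frames, claimant, backgrounds, "electret"))
```

The only design point the sketch tries to capture is that the background model is selected by the claimant's training handset rather than being a single handset-independent model; how the handset type is detected is outside its scope.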
Similar resources
Comparison of background normalization methods for text-independent speaker verification
This paper compares two approaches to background model representation for a text-independent speaker verification task using Gaussian mixture models. We compare speaker-dependent background speaker sets to the use of a universal, speaker-independent background model (UBM). For the UBM, we describe how Bayesian adaptation can be used to derive claimant speaker models, providing a structure leadi...
Robust speaker verification insensitive to session-dependent utterance variation and handset-dependent distortion
This paper investigates a method of creating robust speaker models that are not sensitive to session-dependent (SD) utterance-variation and handset-dependent (HD) distortion for hidden Markov model (HMM)-based speaker verification systems in a real telephone network. We recently reported a method of creating session-independent (SI) speaker-HMMs that are not sensitive to SD utterance-variation. ...
Environment adaptation for robust speaker verification
In speaker verification over public telephone networks, utterances can be obtained from different types of handsets. Different handsets may introduce different degrees of distortion to the speech signals. This paper attempts to combine a handset selector with (1) handset-specific transformations and (2) handset-dependent speaker models to reduce the effect caused by the acoustic distortion. Spe...
New distance measures for text-independent speaker identification
Distance measures [1][2][3] based on the covariance matrix of feature vectors were applied to text-independent speaker verification and identification. However, some of them do not satisfy the symmetric property which is fundamental to a distance measure. In this paper, we propose several symmetric distance measures based on the covariance matrix of feature vectors, and then construct some adva...
Environment adaptation for robust speaker verification by cascading maximum likelihood linear regression and reinforced learning
In speaker verification over public telephone networks, utterances can be obtained from different types of handsets. Different handsets may introduce different degrees of distortion to the speech signals. This paper attempts to combine a handset selector with (1) handset-specific transformations, (2) reinforced learning, and (3) stochastic feature transformation to reduce the effect caused by t...